Self-organization and missing values in SOM and GTM
Abstract
In this paper, we study fundamental properties of the Self-Organizing Map (SOM) and the Generative Topographic Mapping (GTM), the ramifications of how the algorithms are initialized, and their behavior in the presence of missing data. We show that the commonly used principal component analysis (PCA) initialization of the GTM does not guarantee good learning results with high-dimensional data. Initializing the GTM with the SOM is shown to improve self-organization on three high-dimensional data sets: the commonly used MNIST and ISOLET data sets and the epigenomic ENCODE data set. We also propose a revised way of handling missing data in the batch SOM algorithm, called the Imputation SOM, and show that the new algorithm is more robust in the presence of missing data. We benchmark the performance of the topographic mappings in the missing value imputation task and conclude that there are better methods for this particular task. Finally, we announce a revised version of the SOM Toolbox for Matlab with added GTM functionality. © 2014 Elsevier B.V. All rights reserved.
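To make the missing-data setting concrete, here is a minimal Python sketch of one conventional way a batch SOM can cope with missing values: distances and prototype updates use only the observed components of each sample. This is an assumption for illustration and is not the Imputation SOM proposed in the paper, whose details the abstract does not give. All function names (`masked_sq_dist`, `best_matching_unit`, `batch_update`, `impute`) are hypothetical, and missing values are assumed to be encoded as NaN.

```python
import math

def masked_sq_dist(x, w):
    """Squared Euclidean distance over the observed (non-NaN) components of x."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w) if not math.isnan(xi))

def best_matching_unit(x, prototypes):
    """Index of the prototype closest to x, ignoring missing components."""
    return min(range(len(prototypes)),
               key=lambda j: masked_sq_dist(x, prototypes[j]))

def batch_update(data, prototypes, grid, sigma):
    """One batch step with a Gaussian neighborhood of width sigma.

    Each prototype component becomes the neighborhood-weighted mean of the
    samples in which that component is observed. `grid` holds the map
    coordinates of each prototype.
    """
    bmus = [best_matching_unit(x, prototypes) for x in data]
    dim = len(prototypes[0])
    updated = []
    for j, w_old in enumerate(prototypes):
        num = [0.0] * dim
        den = [0.0] * dim
        for x, b in zip(data, bmus):
            # Neighborhood weight from the distance between map nodes j and b.
            d2 = sum((a - c) ** 2 for a, c in zip(grid[j], grid[b]))
            h = math.exp(-d2 / (2.0 * sigma ** 2))
            for k, xk in enumerate(x):
                if not math.isnan(xk):
                    num[k] += h * xk
                    den[k] += h
        # Keep the old value if no sample contributed to a component.
        updated.append([num[k] / den[k] if den[k] > 0 else w_old[k]
                        for k in range(dim)])
    return updated

def impute(x, prototypes):
    """Fill the missing components of x from its best-matching prototype."""
    b = best_matching_unit(x, prototypes)
    return [prototypes[b][k] if math.isnan(xk) else xk
            for k, xk in enumerate(x)]
```

In this sketch a sample such as `[9.0, nan]` is matched using its first component only, and `impute` then copies the second component from the winning prototype. Filling values from the best-matching unit is one simple choice among several.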
Similar resources
Missing data imputation through Generative Topographic Mapping as a mixture of t-distributions: Theoretical developments
The Generative Topographic Mapping (GTM) was originally conceived as a probabilistic alternative to the well-known, neural network-inspired, Self-Organizing Map (SOM). The GTM can also be interpreted as a constrained mixture of distributions model. In recent years, much attention has been directed towards Student t-distributions as an alternative to Gaussians in mixture models due to their robu...
Experimental Analysis of GTM
Nonlinear methods for statistical data analysis have become more and more popular thanks to the rapid development of computers. The fields in which they are applied are as varied as the methods themselves. Generative topographic mapping (GTM) has been developed by [Bishop et al. 1997] as a principled alternative to the self-organizing map (SOM) algorithm [Kohonen 1982] in which a set of unla...
S-Map: A Network with a Simple Self-Organization Algorithm for Generative Topographic Mappings
The S-Map is a network with a simple learning algorithm that combines the self-organization capability of the Self-Organizing Map (SOM) and the probabilistic interpretability of the Generative Topographic Mapping (GTM). The simulations suggest that the S-Map algorithm has a stronger tendency to self-organize from a random initial configuration than the GTM. The S-Map algorithm can be further simpl...
Developments of the generative topographic mapping
The Generative Topographic Mapping (GTM) model was introduced by 7) as a probabilistic re-formulation of the self-organizing map (SOM). It offers a number of advantages compared with the standard SOM, and has already been used in a variety of applications. In this paper we report on several extensions of the GTM, including an incremental version of the EM algorithm for estimating the model para...
Missing data imputation through GTM as a mixture of t-distributions
The Generative Topographic Mapping (GTM) was originally conceived as a probabilistic alternative to the well-known, neural network-inspired, Self-Organizing Maps. The GTM can also be interpreted as a constrained mixture of distribution models. In recent years, much attention has been directed towards Student t-distributions as an alternative to Gaussians in mixture models due to their robustnes...
Journal:
- Neurocomputing
Volume 147
Publication date: 2015